AITopics | deep learning architecture

Collaborating Authors

deep learning architecture

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Copresheaf Topological Neural Networks: A Generalized Deep Learning Framework

Neural Information Processing SystemsJun-14-2026, 05:48:00 GMT

We introduce copresheaf topological neural networks (CTNNs), a powerful unifying framework that encapsulates a wide spectrum of deep learning architectures, designed to operate on structured data, including images, point clouds, graphs, meshes, and topological manifolds. While deep learning has profoundly impacted domains ranging from digital assistants to autonomous systems, the principled design of neural architectures tailored to specific tasks and data types remains one of the field's most persistent open challenges. CTNNs address this gap by formulating model design in the language of copresheaves, a concept from algebraic topology that generalizes most practical deep learning models in use today. This abstract yet constructive formulation yields a rich design space from which theoretically sound and practically effective solutions can be derived to tackle core challenges in representation learning, such as long-range dependencies, oversmoothing, heterophily, and non-Euclidean domains. Our empirical results on structured data benchmarks demonstrate that CTNNs consistently outperform conventional baselines, particularly in tasks requiring hierarchical or localized sensitivity. These results establish CTNNs as a principled multi-scale foundation for the next generation of deep learning architectures.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Theoretical Limits of Pipeline Parallel Optimization and Application to Distributed Deep Learning

Neural Information Processing SystemsDec-26-2025, 03:31:13 GMT

We investigate the theoretical limits of pipeline parallel learning of deep learning architectures, a distributed setup in which the computation is distributed per layer instead of per example. For smooth convex and non-convex objective functions, we provide matching lower and upper complexity bounds and show that a naive pipeline parallelization of Nesterov's accelerated gradient descent is optimal. For non-smooth convex functions, we provide a novel algorithm coined Pipeline Parallel Random Smoothing (PPRS) that is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension. While the convergence rate still obeys a slow $\varepsilon^{-2}$ convergence rate, the depth-dependent part is accelerated, resulting in a near-linear speed-up and convergence time that only slightly depends on the depth of the deep learning architecture. Finally, we perform an empirical analysis of the non-smooth non-convex case and show that, for difficult and highly non-smooth problems, PPRS outperforms more traditional optimization algorithms such as gradient descent and Nesterov's accelerated gradient descent for problems where the sample size is limited, such as few-shot or adversarial learning.

parallel optimization and application, pipeline parallel optimization, theoretical limit, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback

Forecasting Future Anatomies: Longitudinal Brain Mri-to-Mri Prediction

Farki, Ali, Moradi, Elaheh, Koundal, Deepika, Tohka, Jussi

arXiv.org Artificial IntelligenceNov-24-2025

Predicting future brain state from a baseline magnetic resonance image (MRI) is a central challenge in neuroimaging and has important implications for studying neurodegenerative diseases such as Alzheimer's disease (AD). Most existing approaches predict future cognitive scores or clinical outcomes, such as conversion from mild cognitive impairment to dementia. Instead, here we investigate longitudinal MRI image-to-image prediction that forecasts a participant's entire brain MRI several years into the future, intrinsically modeling complex, spatially distributed neurodegenerative patterns. We implement and evaluate five deep learning architectures (UNet, U2-Net, UNETR, Time-Embedding UNet, and ODE-UNet) on two longitudinal cohorts (ADNI and AIBL). Predicted follow-up MRIs are directly compared with the actual follow-up scans using metrics that capture global similarity and local differences. The best performing models achieve high-fidelity predictions, and all models generalize well to an independent external dataset, demonstrating robust cross-cohort performance. Our results indicate that deep learning can reliably predict participant-specific brain MRI at the voxel level, offering new opportunities for individualized prognosis.

artificial intelligence, machine learning, prediction, (17 more...)

arXiv.org Artificial Intelligence

2511.02558

Country: North America > United States > California (0.29)

Genre: Research Report > New Finding (0.89)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

XAI-Driven Deep Learning for Protein Sequence Functional Group Classification

Chakraborty, Pratik, Bhargava, Aryan

arXiv.org Artificial IntelligenceNov-19-2025

Proteins perform essential biological functions, and accurate classification of their sequences is critical for understanding structure-function relationships, enzyme mechanisms, and molecular interactions. This study presents a deep learning-based framework for functional group classification of protein sequences derived from the Protein Data Bank (PDB). Four architectures were implemented: Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory (BiLSTM), CNN-BiLSTM hybrid, and CNN with Attention. Each model was trained using k-mer integer encoding to capture both local and long-range dependencies. Among these, the CNN achieved the highest validation accuracy of 91.8%, demonstrating the effectiveness of localized motif detection. Explainable AI techniques, including Grad-CAM and Integrated Gradients, were applied to interpret model predictions and identify biologically meaningful sequence motifs. The discovered motifs, enriched in histidine, aspartate, glutamate, and lysine, represent amino acid residues commonly found in catalytic and metal-binding regions of transferase enzymes. These findings highlight that deep learning models can uncover functionally relevant biochemical signatures, bridging the gap between predictive accuracy and biological interpretability in protein sequence analysis.

artificial intelligence, classification, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2511.13791

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
Education > Health & Safety > School Nutrition (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Attention Augmented GNN RNN-Attention Models for Advanced Cybersecurity Intrusion Detection

Biradar, Jayant, Shah, Smit, Naik, Tanmay

arXiv.org Artificial IntelligenceOct-31-2025

In this paper, we propose a novel hybrid deep learning architecture that synergistically combines Graph Neural Networks (GNNs), Recurrent Neural Networks (RNNs), and multi-head attention mechanisms to significantly enhance cybersecurity intrusion detection capabilities. By leveraging the comprehensive UNSW-NB15 dataset containing diverse network traffic patterns, our approach effectively captures both spatial dependencies through graph structural relationships and temporal dynamics through sequential analysis of network events. The integrated attention mechanism provides dual benefits of improved model interpretability and enhanced feature selection, enabling cybersecurity analysts to focus computational resources on high-impact security events -- a critical requirement in modern real-time intrusion detection systems. Our extensive experimental evaluation demonstrates that the proposed hybrid model achieves superior performance compared to traditional machine learning approaches and standalone deep learning models across multiple evaluation metrics, including accuracy, precision, recall, and F1-score. The model achieves particularly strong performance in detecting sophisticated attack patterns such as Advanced Persistent Threats (APTs), Distributed Denial of Service (DDoS) attacks, and zero-day exploits, making it a promising solution for next-generation cybersecurity applications in complex network environments.

artificial intelligence, deep learning, machine learning, (11 more...)

arXiv.org Artificial Intelligence

2510.25802

Country: North America > United States > Arizona > Pima County > Tucson (0.15)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Investigating the Impact of Rational Dilated Wavelet Transform on Motor Imagery EEG Decoding with Deep Learning Models

Siino, Marco, Bonomo, Giuseppe, Sorbello, Rosario, Tinnirello, Ilenia

arXiv.org Artificial IntelligenceOct-13-2025

The present study investigates the impact of the Rational Discrete Wavelet Transform (RDWT), used as a plug-in preprocessing step for motor imagery electroencephalographic (EEG) decoding prior to applying deep learning classifiers. A systematic paired evaluation (with/without RDWT) is conducted on four state-of-the-art deep learning architectures: EEGNet, ShallowConvNet, MBEEG\_SENet, and EEGTCNet. This evaluation was carried out across three benchmark datasets: High Gamma, BCI-IV-2a, and BCI-IV-2b. The performance of the RDWT is reported with subject-wise averages using accuracy and Cohen's kappa, complemented by subject-level analyses to identify when RDWT is beneficial. On BCI-IV-2a, RDWT yields clear average gains for EEGTCNet (+4.44 percentage points, pp; kappa +0.059) and MBEEG\_SENet (+2.23 pp; +0.030), with smaller improvements for EEGNet (+2.08 pp; +0.027) and ShallowConvNet (+0.58 pp; +0.008). On BCI-IV-2b, the enhancements observed are modest yet consistent for EEGNet (+0.21 pp; +0.044) and EEGTCNet (+0.28 pp; +0.077). On HGD, average effects are modest to positive, with the most significant gain observed for MBEEG\_SENet (+1.65 pp; +0.022), followed by EEGNet (+0.76 pp; +0.010) and EEGTCNet (+0.54 pp; +0.008). Inspection of the subject material reveals significant enhancements in challenging recordings (e.g., non-stationary sessions), indicating that RDWT can mitigate localized noise and enhance rhythm-specific information. In conclusion, RDWT is shown to be a low-overhead, architecture-aware preprocessing technique that can yield tangible gains in accuracy and agreement for deep model families and challenging subjects.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.09242

Country: Europe > Austria (0.14)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

38db3aed920cf82ab059bfccbd02be6a-Reviews.html

Neural Information Processing SystemsOct-3-2025, 09:08:34 GMT

It is know that adding an additive gaussian noise to the feature is equivalent to an l_2 regularization in a least square problem (Bishop). This paper studies multiplicative Bernoulli feature noising, in a shallow learning architecture, with a general loss function and shows that it has the effect of adapting the geometry through an l_2 regularizer that rescales the feature (beta^{\top} D(beta,X) beta). The Matrix D(beta,X) is a estimate of the inverse diagonal fisher information. It is worth noting that D does not depend on the labels. The equivalent regularizer of dropout is non convex in general.

delta, dropout, regularizer, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Nevada (0.04)

Genre:

Summary/Review (0.48)
Research Report > New Finding (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

Add feedback

Process-Informed Forecasting of Complex Thermal Dynamics in Pharmaceutical Manufacturing

Rubini, Ramona, Khodakarami, Siavash, Bora, Aniruddha, Karniadakis, George Em, Dassisti, Michele

arXiv.org Artificial IntelligenceSep-25-2025

Accurate time-series forecasting for complex physical systems is the backbone of modern industrial monitoring and control. While deep learning models excel at capturing complex dynamics, currently, their deployment is limited due to physical inconsistency and robustness, hence constraining their reliability in regulated environments. We introduce process-informed forecasting (PIF) models for temperature in pharmaceutical lyophilization. We investigate a wide range of models, from classical ones such as Autoregressive Integrated Moving Average Model (ARIMA) and Exponential Smoothing Model (ETS), to modern deep learning architectures, including Kolmogorov-Arnold Networks (KANs). We compare three different loss function formulations that integrate a process-informed trajectory prior: a fixed-weight loss, a dynamic uncertainty-based loss, and a Residual-Based Attention (RBA) mechanism. We evaluate all models not only for accuracy and physical consistency but also for robustness to sensor noise. Furthermore, we test the practical generalizability of the best model in a transfer learning scenario on a new process. Our results show that PIF models outperform their data-driven counterparts in terms of accuracy, physical plausibility and noise resilience. This work provides a roadmap for developing reliable and generalizable forecasting solutions for critical applications in the pharmaceutical manufacturing landscape.

artificial intelligence, machine learning, uncertainty 0, (18 more...)

arXiv.org Artificial Intelligence

2509.20349

Country: North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.25)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Evaluation of State-of-the-Art Deep Learning Techniques for Plant Disease and Pest Detection

Banerjee, Saptarshi, Mallick, Tausif, Chakroborty, Amlan, Saha, Himadri Nath, Takur, Nityananda T.

arXiv.org Artificial IntelligenceAug-13-2025

Addressing plant diseases and pests is critical for enhancing crop production and preventing economic losses. Recent advances in artificial intelligence (AI), machine learning (ML), and deep learning (DL) have significantly improved the precision and efficiency of detection methods, surpassing the limitations of manual identification. This study reviews modern computer-based techniques for detecting plant diseases and pests from images, including recent AI developments. The methodologies are organized into five categories: hyperspectral imaging, non-visualization techniques, visualization approaches, modified deep learning architectures, and transformer models. This structured taxonomy provides researchers with detailed, actionable insights for selecting advanced state-of-the-art detection methods. A comprehensive survey of recent work and comparative studies demonstrates the consistent superiority of modern AI-based approaches, which often outperform older image analysis methods in speed and accuracy. In particular, vision transformers such as the Hierarchical Vision Transformer (HvT) have shown accuracy exceeding 99.3% in plant disease detection, outperforming architectures like MobileNetV3. The study concludes by discussing system design challenges, proposing solutions, and outlining promising directions for future research.

artificial intelligence, comput mater contin, machine learning, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.32604/cmc.2025.065250

2508.08317

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California (0.67)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Energy (1.00)
Food & Agriculture > Agriculture > Pest Control (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Evaluation of Deep Learning Models for LBBB Classification in ECG Signals

Ordóñez, Beatriz Macas, Villavicencio, Diego Vinicio Orellana, Ferrández, José Manuel, Bonomini, Paula

arXiv.org Artificial IntelligenceAug-6-2025

This study explores different neural network architectures to evaluate their ability to extract spatial and temporal patterns from electrocardiographic (ECG) signals and classify them into three groups: healthy subjects, Left Bundle Branch Block (LBBB), and Strict Left Bundle Branch Block (sLBBB). Clinical Relevance, Innovative technologies enable the selection of candidates for Cardiac Resynchronization Therapy (CRT) by optimizing the classification of subjects with Left Bundle Branch Block (LBBB).

artificial intelligence, classification, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2508.0271

Country:

South America > Ecuador (0.18)
South America > Argentina (0.16)

Genre: Research Report > New Finding (0.74)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (0.80)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback